Goto

Collaborating Authors

 prediction region


CONTRA: Conformal Prediction Region via Normalizing Flow Transformation

arXiv.org Machine Learning

Density estimation and reliable prediction regions for outputs are crucial in supervised and unsupervised learning. While conformal prediction effectively generates coverage-guaranteed regions, it struggles with multi-dimensional outputs due to reliance on one-dimensional nonconformity scores. To address this, we introduce CONTRA: CONformal prediction region via normalizing flow TRAnsformation. CONTRA utilizes the latent spaces of normalizing flows to define nonconformity scores based on distances from the center. This allows for the mapping of high-density regions in latent space to sharp prediction regions in the output space, surpassing traditional hyperrectangular or elliptical conformal regions. Further, for scenarios where other predictive models are favored over flow-based models, we extend CONTRA to enhance any such model with a reliable prediction region by training a simple normalizing flow on the residuals. We demonstrate that both CONTRA and its extension maintain guaranteed coverage probability and outperform existing methods in generating accurate prediction regions across various datasets. We conclude that CONTRA is an effective tool for (conditional) density estimation, addressing the under-explored challenge of delivering multi-dimensional prediction regions.


TRACE: Transport Alignment Conformal Prediction via Diffusion and Flow Matching Models

arXiv.org Machine Learning

Constructing valid and informative conformal prediction regions for multi-dimensional outputs remains a fundamental challenge. While conformal prediction provides finite-sample, distribution-free coverage guarantees, its practical performance critically depends on the choice of nonconformity score. Existing approaches often rely on restrictive geometric assumptions or require explicit likelihood evaluation and invertible transformations, limiting their applicability in complex generative settings. In this work, we introduce TRACE (TRansport Alignment Conformal Estimation), a conformal prediction framework that defines nonconformity through transport alignment in diffusion and flow matching models. Rather than evaluating likelihoods, we measure how well a candidate output aligns with the learned generative dynamics by averaging denoising or velocity-matching errors along stochastic transport trajectories. The resulting transport-based scores are scalar-valued and can be calibrated using split conformal prediction, yielding valid marginal coverage under exchangeability. We further analyze the statistical properties of the proposed scores and their sensitivity to computational budget. Experiments on synthetic and real datasets demonstrate valid coverage and show that the resulting regions adapt naturally to multimodal and non-convex conditional distributions.


A Kernel Nonconformity Score for Multivariate Conformal Prediction

arXiv.org Machine Learning

Multivariate conformal prediction requires nonconformity scores that compress residual vectors into scalars while preserving certain implicit geometric structure of the residual distribution. We introduce a Multivariate Kernel Score (MKS) that produces prediction regions that explicitly adapt to this geometry. We show that the proposed score resembles the Gaussian process posterior variance, unifying Bayesian uncertainty quantification with the coverage guarantees of frequentist-type. Moreover, the MKS can be decomposed into an anisotropic Maximum Mean Discrepancy (MMD) that interpolates between kernel density estimation and covariance-weighted distance. We prove finite-sample coverage guarantees and establish convergence rates that depend on the effective rank of the kernel-based covariance operator rather than the ambient dimension, enabling dimension-free adaptation. On regression tasks, the MKS reduces the volume of prediction region significantly, compared to ellipsoidal baselines while maintaining nominal coverage, with larger gains at higher dimensions and tighter coverage levels.


Conformal Robust Set Estimation

arXiv.org Machine Learning

Conformal prediction provides finite-sample, distribution-free coverage under exchangeability, but standard constructions may lack robustness in the presence of outliers or heavy tails. We propose a robust conformal method based on a non-conformity score defined as the half-mass radius around a point, equivalently the distance to its $(\lfloor n/2\rfloor+1)$-nearest neighbour. We show that the resulting conformal regions are marginally valid for any sample size and converge in probability to a robust population central set defined through a distance-to-a-measure functional. Under mild regularity conditions, we establish exponential concentration and tail bounds that quantify the deviation between the empirical conformal region and its population counterpart. These results provide a probabilistic justification for using robust geometric scores in conformal prediction, even for heavy-tailed or multi-modal distributions.



Approximate full conformal prediction in an RKHS

arXiv.org Machine Learning

Full conformal prediction is a framework that implicitly formulates distribution-free confidence prediction regions for a wide range of estimators. However, a classical limitation of the full conformal framework is the computation of the confidence prediction regions, which is usually impossible since it requires training infinitely many estimators (for real-valued prediction for instance). The main purpose of the present work is to describe a generic strategy for designing a tight approximation to the full conformal prediction region that can be efficiently computed. Along with this approximate confidence region, a theoretical quantification of the tightness of this approximation is developed, depending on the smoothness assumptions on the loss and score functions. The new notion of thickness is introduced for quantifying the discrepancy between the approximate confidence region and the full conformal one.


Conformal Prediction for Compositional Data

arXiv.org Machine Learning

In this work, we propose a set of conformal prediction procedures tailored to compositional responses, where outcomes are proportions that must be positive and sum to one. Building on Dirichlet regression, we introduce a split conformal approach based on quantile residuals and a highest-density region strategy that combines a fast coordinate-floor approximation with an internal grid refinement to restore sharpness. Both constructions are model-agnostic at the conformal layer and guarantee finite-sample marginal coverage under exchangeability, while respecting the geometry of the simplex. A comprehensive Monte Carlo study spanning homoscedastic and heteroscedastic designs shows that the quantile residual and grid-refined HDR methods achieve empirical coverage close to the nominal 90\% level and produce substantially narrower regions than the coordinate-floor approximation, which tends to be conservative. We further demonstrate the methods on household budget shares from the BudgetItaly dataset, using standardized socioeconomic and price covariates with a train, calibration, and test split. In this application, the grid-refined HDR attains coverage closest to the target with the smallest average widths, closely followed by the quantile residual approach, while the simple triangular HDR yields wider, less informative sets. Overall, the results indicate that conformal prediction on the simplex can be both calibrated and efficient, providing practical uncertainty quantification for compositional prediction tasks.


Conformal Prediction in The Loop: A Feedback-Based Uncertainty Model for Trajectory Optimization

arXiv.org Artificial Intelligence

Conformal Prediction (CP) is a powerful statistical machine learning tool to construct uncertainty sets with coverage guarantees, which has fueled its extensive adoption in generating prediction regions for decision-making tasks, e.g., Trajectory Optimization (TO) in uncertain environments. However, existing methods predominantly employ a sequential scheme, where decisions rely unidirectionally on the prediction regions, and consequently the information from decision-making fails to be fed back to instruct CP. In this paper, we propose a novel Feedback-Based CP (Fb-CP) framework for shrinking-horizon TO with a joint risk constraint over the entire mission time. Specifically, a CP-based posterior risk calculation method is developed by fully leveraging the realized trajectories to adjust the posterior allowable risk, which is then allocated to future times to update prediction regions. In this way, the information in the realized trajectories is continuously fed back to the CP, enabling attractive feedback-based adjustments of the prediction regions and a provable online improvement in trajectory performance. Furthermore, we theoretically prove that such adjustments consistently maintain the coverage guarantees of the prediction regions, thereby ensuring provable safety. Additionally, we develop a decision-focused iterative risk allocation algorithm with theoretical convergence analysis for allocating the posterior allowable risk which closely aligns with Fb-CP. Furthermore, we extend the proposed method to handle distribution shift. The effectiveness and superiority of the proposed method are demonstrated through benchmark experiments.


On some practical challenges of conformal prediction

arXiv.org Machine Learning

Conformal prediction is a model-free machine learning method for creating prediction regions with a guaranteed coverage probability level. However, a data scientist often faces three challenges in practice: (i) the determination of a conformal prediction region is only approximate, jeopardizing the finite-sample validity of prediction, (ii) the computation required could be prohibitively expensive, and (iii) the shape of a conformal prediction region is hard to control. This article offers new insights into the relationship among the monotonicity of the non-conformity measure, the monotonicity of the plausibility function, and the exact determination of a conformal prediction region. Based on these new insights, we propose a simple strategy to alleviate the three challenges simultaneously.


Conformal Inference for Time Series over Graphs

arXiv.org Artificial Intelligence

Trustworthy decision making in networked, dynamic environments calls for innovative uncertainty quantification substrates in predictive models for graph time series. Existing conformal prediction (CP) methods have been applied separately to multivariate time series and static graphs, but they either ignore the underlying graph topology or neglect temporal dynamics. To bridge this gap, here we develop a CP-based sequential prediction region framework tailored for graph time series. A key technical innovation is to leverage the graph structure and thus capture pairwise dependencies across nodes, while providing user-specified coverage guarantees on the predictive outcomes. We formally establish that our scheme yields an exponential shrinkage in the volume of the ellipsoidal prediction set relative to its graph-agnostic counterpart. Using real-world datasets, we demonstrate that the novel uncertainty quantification framework maintains desired empirical coverage while achieving markedly smaller (up to 80% reduction) prediction regions than existing approaches.